Abstract



Hierarchical Bayes procedures were compared for estimating item and ability parameters
in item response theory. Simulated data sets from the two-parameter logistic model were
analyzed using three different hierarchical Bayes procedures: the joint Bayesian with known
hyperparameters (JB1), the joint Bayesian with informative hyperpriors (JB2), and the
marginal Bayesian with known hyperparameters (MB). MB yielded consistently smaller root
mean square differences than either JB1 or JB2 for item and ability estimates. The maximum
a posteriori estimation used along with MB yielded larger biases than the joint Bayes modal
estimation in JB1 and JB2. As the sample size and test length increased, the three Bayes
procedures yielded essentially the same result.

Key words: Bayes estimation, hierarchical prior, item response theory, joint Bayesian 
estimation, marginal Bayesian estimation. 



Introduction 



A common situation in item response theory (IRT) is that in which both item and ability (i.e.,
structural and incidental) parameters have to be estimated simultaneously. When this is the
case, Bayesian estimation may be preferable to maximum likelihood estimation. Bayesian
methods yield item discrimination parameter estimates which never become infinite; lower
asymptote estimates of item characteristic curves which do not have implausible values;
and ability estimates which are automatically restricted to a reasonable range (Lord,
1986). Although Bayes procedures have been available for some time, the properties of
these techniques have not been studied as thoroughly as those of maximum likelihood
methods. The purpose of this study, therefore, was to compare different Bayes procedures
for estimation of item and ability parameters in IRT.

Bayesian approaches in IRT can be distinguished on the basis of whether estimation
of item parameters is done with or without marginalization over ability parameters. If
marginalization is used, the solution is marginal Bayesian estimation; if marginalization is
not used, the solution is joint Bayesian estimation.

Swaminathan and Gifford (1982, 1985, 1986) developed the joint Bayesian procedures
for the one-, two-, and three-parameter item characteristic curve models. Their methods
implement the hierarchical Bayes procedure for the specification of prior beliefs following
the approach taken by Lindley (1971) and Lindley and Smith (1972). Evidence presented by
Swaminathan and Gifford indicated that joint Bayesian parameter estimates were superior
to those obtained via joint maximum likelihood estimation in that they remained in the
parameter space, had smaller mean square differences from the underlying values, and were
less biased (Gifford & Swaminathan, 1990).

Mislevy (1986) employed the hierarchical Bayesian estimation model of Lindley and Smith
(1972) to extend the marginal maximum likelihood approach to a marginal Bayesian solution.
This permitted prior distributions to be posited for item parameters. Supplementary
Bayesian procedures can also be used to obtain ability estimates once the marginal Bayesian
estimates are obtained for item parameters. Mislevy and Bock (1989) implemented this
Bayesian approach in the BILOG computer program. Tsutakawa and Lin (1986) also
proposed a marginal Bayesian estimation to compute the posterior mode using the EM
algorithm.

Evidence has been subsequently presented which points to the likelihood that marginal
modes may provide better approximations than joint modes to posterior means when
nuisance (i.e., ability) parameters are present (Mislevy, 1986; O'Hagan, 1976; Tsutakawa
& Lin, 1986). As yet, however, no empirical analyses have been reported which test this
point.

Bayesian approaches are characterized by incorporation of prior information or beliefs
into the estimation of parameters in order to improve the accuracy of those estimates.
Specification of priors in Bayesian analysis is a subjective matter. A number of different
forms of priors have been studied (e.g., Leonard & Novick, 1985; Lord, 1980; Mislevy,
1986; Mislevy & Bock, 1989; Swaminathan & Gifford, 1986; Tsutakawa & Lin, 1986). The
terminology describing the structure of priors can sometimes be quite confusing. In a classical
Bayesian approach, a single prior can be selected for the ordinary parameters. It is possible
to recognize some uncertainty in priors. When priors are expressed in terms of a family or class
of priors, the parameters of that class are called hyperparameters. Hyperparameters
describe the distributional characteristics of the prior information. It is sometimes also
convenient to specify prior information on the hyperparameters as well. This second prior is
called a hyperprior and contains parameters which are referred to as hyperhyperparameters
(Good, 1980, 1983; Lindley, 1971; Lindley & Smith, 1972).

To completely exploit the potential of Bayesian estimation requires understanding
of its mathematical underpinnings, particularly the role of prior distributions in estimating
parameters. In the present study, we compared the effectiveness of three hierarchical Bayes
procedures for obtaining item and ability estimates: the joint Bayesian estimation with
known hyperparameters (JB1), the joint Bayesian estimation with informative hyperpriors
(JB2), and the marginal Bayesian estimation with known hyperparameters (MB).

In the following sections, we present a discussion of joint and marginal Bayesian
estimation in IRT. Included is a presentation of prior and posterior distributions focusing
specifically on one- and two-stage hierarchical priors. Finally, we present a discussion of the
two joint Bayesian methods considering the specific priors dealt with in this paper.

Background 

The Model 

Item characteristic curve models are expressed as mathematical equations of the probability
of a correct response to a test item as a function of the ability of the person responding.
Consider binary responses to a set of $n$ test items by a set of $N$ examinees. A response of
an examinee $i$ to an item $j$ is represented in these models by a random variable $U_{ij}$, where
$i = 1, \ldots, N$ and $j = 1, \ldots, n$. The probability of a correct response to item $j$ is represented
by

$$P(U_{ij} = 1 \mid \theta_i, \boldsymbol{\xi}_j) = P_j(\theta_i), \tag{1}$$

and the probability of an incorrect response is given by

$$P(U_{ij} = 0 \mid \theta_i, \boldsymbol{\xi}_j) = 1 - P_j(\theta_i) = Q_j(\theta_i), \tag{2}$$

depending on a real-valued ability parameter $\theta_i$ and a real- or vector-valued item parameter
$\boldsymbol{\xi}_j$.



The item characteristic curve of the three-parameter model¹ is given by

$$P_j(\theta_i) = c_j + (1 - c_j)\left[1 + \exp\{-a_j(\theta_i - b_j)\}\right]^{-1}, \tag{3}$$

where $a_j$ is the item discrimination parameter, $b_j$ is the item difficulty parameter, $c_j$ is the
lower asymptote of the item characteristic curve for item $j$, and $\theta_i$ is the ability parameter
of person $i$.
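
As a minimal sketch of Equation 3 (not part of the original study), the following Python
function evaluates the three-parameter curve for every examinee-item pair. The function
name `icc_3pl` and the use of NumPy are assumptions made for illustration only.

```python
import numpy as np

def icc_3pl(theta, a, b, c):
    """Return the N x n matrix of P_j(theta_i) under the three-parameter model.

    theta: abilities (length N); a, b, c: discrimination, difficulty, and
    lower asymptote for each item (length n).
    """
    theta = np.asarray(theta, dtype=float)[:, None]   # shape (N, 1)
    a = np.asarray(a, dtype=float)[None, :]           # shape (1, n)
    b = np.asarray(b, dtype=float)[None, :]
    c = np.asarray(c, dtype=float)[None, :]
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))
```

Setting every `c` to zero gives the two-parameter model, and additionally fixing every `a`
to a common value gives the one-parameter model.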

Likelihood Function 

Under typical testing conditions, a sample of examinees is drawn at random from a
population of examinees possessing the underlying ability. No assumption is necessary as
to the distribution of the examinees over the ability continuum (Lord & Novick, 1968). For
each examinee there is a vector of dichotomously scored item responses of length $n$ denoted
by $\mathbf{U}_i = (U_{i1}, \ldots, U_{in})'$. One such vector exists for each of the $N$ examinees. The resulting
$N \times n$ matrix of item responses is denoted by $\mathbf{U}$.

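Under the local independence assumption, the probability of $\mathbf{U}_i$ given ability $\theta_i$ and item
parameters $\boldsymbol{\xi}$ is

$$p(\mathbf{U}_i \mid \theta_i, \boldsymbol{\xi}) = \prod_{j=1}^{n} P_j(\theta_i)^{U_{ij}} Q_j(\theta_i)^{1 - U_{ij}}, \tag{4}$$

where $\boldsymbol{\xi} = (\boldsymbol{\xi}_1', \ldots, \boldsymbol{\xi}_n')'$. If $\boldsymbol{\theta}$ is the vector of the $N$ examinee trait scores,
$\boldsymbol{\theta} = (\theta_1, \ldots, \theta_N)'$, the joint probability of $\mathbf{U}$ given $\boldsymbol{\theta}$ and $\boldsymbol{\xi}$ can be written as

$$p(\mathbf{U} \mid \boldsymbol{\theta}, \boldsymbol{\xi}) = \prod_{i=1}^{N} \prod_{j=1}^{n} P_j(\theta_i)^{U_{ij}} Q_j(\theta_i)^{1 - U_{ij}}. \tag{5}$$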

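When we make inferences about both ability and item parameters from the observed
data $\mathbf{u}$ of the $N \times n$ matrix of item responses, the probability of $\mathbf{u}$ given $\boldsymbol{\theta}$ and $\boldsymbol{\xi}$ is

$$l(\boldsymbol{\theta}, \boldsymbol{\xi}) = p(\mathbf{u} \mid \boldsymbol{\theta}, \boldsymbol{\xi}) = \prod_{i=1}^{N} \prod_{j=1}^{n} P_j(\theta_i)^{u_{ij}} Q_j(\theta_i)^{1 - u_{ij}}. \tag{6}$$

The likelihood, $l(\boldsymbol{\theta}, \boldsymbol{\xi})$, is a function of the parameters of the $n$ item characteristic curves and
the $N$ abilities.

¹ Because of inclusiveness, that is, because the one- and two-parameter item characteristic curve
models are regarded as special cases of the three-parameter model, all expressions are
developed below only for Birnbaum's three-parameter model (Birnbaum, 1968).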

Parameter Estimation in IRT 

The four main approaches currently used in IRT for parameter estimation are (a) joint
maximum likelihood estimation, (b) joint Bayesian estimation, (c) marginal maximum
likelihood estimation, and (d) marginal Bayesian estimation. The following discussion
presents a description of the Bayesian procedures as the extensions of the maximum likelihood
methods where the priors are posited for the item and ability parameters.

The joint maximum likelihood estimation (Birnbaum, 1968; Lord, 1980; Wingersky,
Barton, & Lord, 1982) simultaneously maximizes the likelihood function $l(\boldsymbol{\theta}, \boldsymbol{\xi})$ in Equation
6.

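The joint Bayesian estimation (Swaminathan & Gifford, 1982, 1985, 1986) simultaneously
maximizes the posterior distribution

$$\pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}) \propto l(\boldsymbol{\theta}, \boldsymbol{\xi})\, \pi(\boldsymbol{\theta}, \boldsymbol{\xi}), \tag{7}$$

where $\propto$ denotes proportionality and $\pi(\boldsymbol{\theta}, \boldsymbol{\xi})$ is the joint prior density of the parameters $\boldsymbol{\theta}$ and
$\boldsymbol{\xi}$. Equivalently, the posterior distribution of parameters given the matrix of observations $\mathbf{u}$
is written as

$$\pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}) = \frac{l(\boldsymbol{\theta}, \boldsymbol{\xi})\, \pi(\boldsymbol{\theta}, \boldsymbol{\xi})}{m(\mathbf{u})}, \tag{8}$$

where $m(\mathbf{u})$ is the marginal probability density function of $\mathbf{u}$ defined as

$$m(\mathbf{u}) = \int_{\Theta} \int_{\Xi} l(\boldsymbol{\theta}, \boldsymbol{\xi})\, \pi(\boldsymbol{\theta}, \boldsymbol{\xi})\, d\boldsymbol{\xi}\, d\boldsymbol{\theta}, \tag{9}$$

where $\Theta$ and $\Xi$ are the parameter spaces for ability and item parameters, respectively. The
posterior density function is a revised expression of the belief one has about the parameters
once the data have been collected. It contains all the information necessary for making
probability statements regarding the parameters of interest.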



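The marginal maximum likelihood estimation of item parameters (Bock & Aitkin, 1981;
Bock & Lieberman, 1970; Harwell, Baker, & Zwarts, 1988) maximizes the marginal likelihood
function

$$m(\boldsymbol{\xi}) = \prod_{i=1}^{N} \int_{\Theta} l(\mathbf{u}_i \mid \theta_i)\, \pi(\theta_i)\, d\theta_i, \tag{10}$$

where $\pi(\theta_i)$ denotes a prior distribution of ability and

$$l(\mathbf{u}_i \mid \theta_i) = \prod_{j=1}^{n} P_j(\theta_i)^{u_{ij}} Q_j(\theta_i)^{1 - u_{ij}} = p(\mathbf{u}_i \mid \theta_i, \boldsymbol{\xi}). \tag{11}$$

Supplementary maximum likelihood estimation and Bayesian estimation procedures can be
used to obtain ability parameter estimates.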

Bayesian priors on item parameters may also be used in the marginal maximum likelihood
estimation to obtain the marginal Bayesian estimation of item parameters (Harwell & Baker,
1991; Mislevy, 1986). The marginal Bayesian estimation maximizes the marginal posterior
distribution

$$\pi(\boldsymbol{\xi} \mid \mathbf{u}) \propto m(\boldsymbol{\xi})\, \pi(\boldsymbol{\xi}), \tag{12}$$

where $m(\boldsymbol{\xi})$ is the marginal likelihood function and $\pi(\boldsymbol{\xi})$ is the prior distribution of item
parameters.
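
The marginalization in Equation 10 can be approximated numerically. The sketch below is an
illustration under stated assumptions, not the BILOG algorithm: it assumes a standard normal
ability prior and uses Gauss-Hermite quadrature from NumPy, and the function name
`log_marginal_likelihood` and the default of 21 nodes are hypothetical choices.

```python
import numpy as np

def log_marginal_likelihood(u, a, b, c, n_nodes=21):
    """Approximate ln m(xi) of Equation 10 with a N(0, 1) prior on ability.

    u: N x n matrix of 0/1 responses; a, b, c: item parameters (length n).
    """
    # Nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)          # weights now sum to 1
    u = np.asarray(u, dtype=float)
    a, b, c = (np.asarray(x, dtype=float)[None, :] for x in (a, b, c))
    # P_j evaluated at every quadrature node: shape (n_nodes, n).
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (nodes[:, None] - b)))
    # ln l(u_i | theta_k) for every examinee and node: shape (N, n_nodes).
    log_l = u @ np.log(p).T + (1.0 - u) @ np.log1p(-p).T
    per_examinee = np.exp(log_l) @ weights            # integral for each i
    return float(np.sum(np.log(per_examinee)))
```

Adding the log prior of the item parameters to this quantity gives, up to a constant, the log of
the marginal posterior in Equation 12.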

Prior and Posterior Distributions 
Prior Distribution 

A flexible family of prior distributions is available by transforming item parameters to
new parameters which may be taken to possess a multivariate normal prior distribution.
To this end Leonard and Novick (1985) and Mislevy (1986) recommend the following
transformations:

$$\alpha_j = \ln a_j \tag{13}$$

and

$$\gamma_j = \ln\{c_j / (1 - c_j)\}. \tag{14}$$

Since $b_j$ is a difficulty parameter, we also use the following expression:

$$\beta_j = b_j. \tag{15}$$
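
The transformations in Equations 13 to 15 and their inverses can be sketched as follows. The
function names are hypothetical; note that the logit in Equation 14 requires $0 < c_j < 1$, so
a two-parameter model (with $c_j = 0$) would simply omit $\gamma_j$.

```python
import numpy as np

def to_transformed(a, b, c):
    """Map (a_j, b_j, c_j) to (alpha_j, beta_j, gamma_j), Equations 13 to 15."""
    a, b, c = (np.asarray(x, dtype=float) for x in (a, b, c))
    return np.log(a), b, np.log(c / (1.0 - c))

def from_transformed(alpha, beta, gamma):
    """Inverse map: a_j = exp(alpha_j), b_j = beta_j, c_j = logistic(gamma_j)."""
    alpha, beta, gamma = (np.asarray(x, dtype=float) for x in (alpha, beta, gamma))
    return np.exp(alpha), beta, 1.0 / (1.0 + np.exp(-gamma))
```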

In order to define the posterior distribution precisely, we first specify the prior belief
about the parameters. We assume that $\boldsymbol{\theta}$ and $\boldsymbol{\xi}$ have priors which are independently distributed with
probability density functions $\pi(\boldsymbol{\theta})$ and $\pi(\boldsymbol{\xi})$, respectively.

Since we use the three-parameter model,

$$\boldsymbol{\xi} = (\alpha_1, \beta_1, \gamma_1, \ldots, \alpha_n, \beta_n, \gamma_n)'. \tag{16}$$

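We assume the vector of item parameters possesses a multivariate normal distribution
conditional on the respective mean vector $\boldsymbol{\mu}_\xi$ and covariance matrix $\boldsymbol{\Sigma}_\xi$. This prior
specification is more general than previous suggestions in the literature. The prior
distribution of item parameters is

$$\pi(\boldsymbol{\xi} \mid \eta) = (2\pi)^{-3n/2} |\boldsymbol{\Sigma}_\xi|^{-1/2} \exp\left\{-\tfrac{1}{2} (\boldsymbol{\xi} - \boldsymbol{\mu}_\xi)' \boldsymbol{\Sigma}_\xi^{-1} (\boldsymbol{\xi} - \boldsymbol{\mu}_\xi)\right\}, \tag{17}$$

where the hyperparameter $\eta = (\boldsymbol{\mu}_\xi, \boldsymbol{\Sigma}_\xi)$.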

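If we assume the vectors of the parameters $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_n)'$, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_n)'$, and
$\boldsymbol{\gamma} = (\gamma_1, \ldots, \gamma_n)'$ to be independent, we can take the vectors $\boldsymbol{\alpha}$, $\boldsymbol{\beta}$, and $\boldsymbol{\gamma}$ to possess
independent multivariate normal distributions, conditional on their respective mean vectors
$\boldsymbol{\mu}_\alpha$, $\boldsymbol{\mu}_\beta$, and $\boldsymbol{\mu}_\gamma$ and covariance matrices $\boldsymbol{\Sigma}_\alpha$, $\boldsymbol{\Sigma}_\beta$, and $\boldsymbol{\Sigma}_\gamma$ (Leonard & Novick, 1985). The
prior distribution of item parameters in this case is

$$\pi(\boldsymbol{\xi} \mid \eta) = \pi(\boldsymbol{\alpha} \mid \eta_\alpha)\, \pi(\boldsymbol{\beta} \mid \eta_\beta)\, \pi(\boldsymbol{\gamma} \mid \eta_\gamma), \tag{18}$$

where $\eta_\alpha = (\boldsymbol{\mu}_\alpha, \boldsymbol{\Sigma}_\alpha)$, $\eta_\beta = (\boldsymbol{\mu}_\beta, \boldsymbol{\Sigma}_\beta)$, $\eta_\gamma = (\boldsymbol{\mu}_\gamma, \boldsymbol{\Sigma}_\gamma)$,

$$\pi(\boldsymbol{\alpha} \mid \eta_\alpha) = (2\pi)^{-n/2} |\boldsymbol{\Sigma}_\alpha|^{-1/2} \exp\left\{-\tfrac{1}{2} (\boldsymbol{\alpha} - \boldsymbol{\mu}_\alpha)' \boldsymbol{\Sigma}_\alpha^{-1} (\boldsymbol{\alpha} - \boldsymbol{\mu}_\alpha)\right\}, \tag{19}$$

and $\pi(\boldsymbol{\beta} \mid \eta_\beta)$ and $\pi(\boldsymbol{\gamma} \mid \eta_\gamma)$ are defined similarly.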

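If we further assume exchangeability for all three parameters, we may take
$\boldsymbol{\mu}_\alpha = \mu_\alpha \mathbf{1}$, $\boldsymbol{\mu}_\beta = \mu_\beta \mathbf{1}$, $\boldsymbol{\mu}_\gamma = \mu_\gamma \mathbf{1}$, $\boldsymbol{\Sigma}_\alpha = \sigma_\alpha^2 \mathbf{I}_n$, $\boldsymbol{\Sigma}_\beta = \sigma_\beta^2 \mathbf{I}_n$, and $\boldsymbol{\Sigma}_\gamma = \sigma_\gamma^2 \mathbf{I}_n$,
where $\mu_\alpha$, $\mu_\beta$, $\mu_\gamma$, $\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_\gamma^2$ are scalars, $\mathbf{1}$ is an $n \times 1$ vector of ones, and
$\mathbf{I}_n$ is an identity matrix of order $n$ (Leonard & Novick, 1985). The prior distribution of item
parameters, assuming exchangeability, is

$$\pi(\boldsymbol{\xi} \mid \eta) = \prod_{j=1}^{n} \pi(\alpha_j \mid \mu_\alpha, \sigma_\alpha^2)\, \pi(\beta_j \mid \mu_\beta, \sigma_\beta^2)\, \pi(\gamma_j \mid \mu_\gamma, \sigma_\gamma^2), \tag{20}$$

where

$$\pi(\alpha_j \mid \mu_\alpha, \sigma_\alpha^2) = (2\pi\sigma_\alpha^2)^{-1/2} \exp\left\{-\frac{1}{2\sigma_\alpha^2} (\alpha_j - \mu_\alpha)^2\right\}, \tag{21}$$

and $\pi(\beta_j \mid \mu_\beta, \sigma_\beta^2)$ and $\pi(\gamma_j \mid \mu_\gamma, \sigma_\gamma^2)$ are defined similarly. This form of the prior distribution of
item parameters is used in the present study for the joint Bayesian estimation as well as for
the marginal Bayesian estimation procedures. A hierarchical Bayes approach is developed
below in which another stage of priors is assigned to the prior parameters $\mu_\alpha$, $\mu_\beta$, $\mu_\gamma$,
$\sigma_\alpha^2$, $\sigma_\beta^2$, and $\sigma_\gamma^2$.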

Hierarchical Approach 

We can specify prior distributions for the parameter vectors $\boldsymbol{\theta}$ and $\boldsymbol{\xi}$ in two stages. This
type of prior distribution is a hierarchical prior (Berger, 1985; Good, 1983), also called a
multistage prior (Lindley, 1971; Lindley & Smith, 1972). The idea is that one may have
structural and subjective prior information at the same time and that it is often convenient
to model this in stages.

The structural knowledge that the $\theta_i$ are independent and identically distributed leads
to the first stage prior description

$$\pi_1(\boldsymbol{\theta}) = \prod_{i=1}^{N} \pi_0(\theta_i). \tag{22}$$

The subscript 1 on $\pi_1$ is to indicate that this is the first stage. The hierarchical approach
then places a second stage subjective prior on $\pi_0$. If we use $\Gamma$ to denote a class of priors, the
hierarchical approach is most commonly used when the first stage, $\Gamma$, consists of priors of a
certain functional form. Thus, if

$$\Gamma_\theta = \{\pi_1(\boldsymbol{\theta} \mid \tau) : \pi_1 \text{ is of a given functional form and } \tau \in T\}, \tag{23}$$

then the second stage would consist of putting a prior distribution, $\pi_2(\tau)$, on the
hyperparameter $\tau$. Such a second stage prior is sometimes called a hyperprior (Berger,
1985; Good, 1983).

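The structural assumption of independence of the $\theta_i$, together with the assumption that
they have a common normal distribution (i.e., we assume that the information on these
parameters is exchangeable), leads to

$$\Gamma_\theta = \left\{\pi_1(\boldsymbol{\theta} \mid \tau) : \pi_1(\boldsymbol{\theta} \mid \tau) = \prod_{i=1}^{N} \pi_0(\theta_i),\ \pi_0 \text{ being } N(\mu_\theta, \sigma_\theta^2),\ -\infty < \mu_\theta < \infty \text{ and } \sigma_\theta^2 > 0\right\}, \tag{24}$$

where

$$\pi_1(\boldsymbol{\theta} \mid \tau) = \prod_{i=1}^{N} (2\pi\sigma_\theta^2)^{-1/2} \exp\left\{-\frac{1}{2\sigma_\theta^2} (\theta_i - \mu_\theta)^2\right\}. \tag{25}$$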

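Similarly, assumptions that item parameters are independent and identically distributed
and that the information on each of the item parameters is exchangeable lead to $\Gamma_\alpha$, $\Gamma_\beta$, and
$\Gamma_\gamma$ with the hyperparameter $\eta$. Then the first stage prior distribution of item parameters
assuming independence and exchangeability is

$$\pi_1(\boldsymbol{\xi} \mid \eta) = \prod_{j=1}^{n} \pi_0(\alpha_j \mid \mu_\alpha, \sigma_\alpha^2)\, \pi_0(\beta_j \mid \mu_\beta, \sigma_\beta^2)\, \pi_0(\gamma_j \mid \mu_\gamma, \sigma_\gamma^2). \tag{26}$$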

The complete prior for the hierarchical model, assuming independence between ability
and item parameters, is

$$\pi(\boldsymbol{\theta}, \boldsymbol{\xi}, \tau, \eta) = \pi_1(\boldsymbol{\theta} \mid \tau)\, \pi_2(\tau)\, \pi_1(\boldsymbol{\xi} \mid \eta)\, \pi_2(\eta), \tag{27}$$

where $\pi_1(\boldsymbol{\theta} \mid \tau)$ is the first stage density of $\boldsymbol{\theta}$ conditional on $\tau$ which takes the second stage
density $\pi_2(\tau)$ and $\pi_1(\boldsymbol{\xi} \mid \eta)$ is the first stage density of $\boldsymbol{\xi}$ conditional on $\eta$ which takes the
second stage density $\pi_2(\eta)$.

Second Stage Prior

Noninformative priors are often used at the second stage because of the difficulty in specifying
second stage priors (Berger, 1985). Sometimes, it is simply assumed that hyperparameters
are known. For example, in the joint Bayesian estimation procedures, identifying restrictions
can be incorporated directly into the prior (Swaminathan & Gifford, 1986) because the
three-parameter model does not need to be identified. Therefore, we set $\mu_\theta = 0$ and $\sigma_\theta^2 = 1$, so
that

$$\pi(\boldsymbol{\theta} \mid \tau) = (2\pi)^{-N/2} \exp\left(-\frac{1}{2} \sum_{i=1}^{N} \theta_i^2\right). \tag{28}$$

In the above specification, setting $\mu_\theta = 0$ and $\sigma_\theta^2 = 1$ contains the explicit assumption that
the hyperparameter $\tau$ is known.

In the present study, we use the identical form of prior for each of the item parameters.
Detailed examples, therefore, are given below only for the transformed item discrimination
parameter. Hyperpriors for $\mu_\alpha$ and $\sigma_\alpha^2$ can be specified by assuming that $\mu_\alpha$ and $\sigma_\alpha^2$ are
independent, $\mu_\alpha$ has a noninformative uniform distribution, and $\sigma_\alpha^2$ has an inverse gamma
distribution with parameters $\nu_\alpha$ and $\lambda_\alpha$, $IG(\nu_\alpha, \lambda_\alpha)$. That is,



(29) 



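where $\nu_\alpha > 0$ and $\lambda_\alpha > 0$. Since $E(\sigma_\alpha^{-2}) = \nu_\alpha \lambda_\alpha$, we consider $(\nu_\alpha \lambda_\alpha)^{-1}$ as a prior variance
estimate and $2\nu_\alpha$ as a prior sample size for the variance of item discrimination (Leonard,
1972; Novick, 1969). The prior for $\boldsymbol{\alpha}$ can be expressed as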

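$$\pi(\boldsymbol{\alpha} \mid \mu_\alpha, \sigma_\alpha^2, \nu_\alpha, \lambda_\alpha) = \left\{\prod_{j=1}^{n} \pi_1(\alpha_j \mid \mu_\alpha, \sigma_\alpha^2)\right\} \pi_2(\mu_\alpha, \sigma_\alpha^2 \mid \nu_\alpha, \lambda_\alpha). \tag{30}$$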



The above expression depends on the nuisance parameters, $\mu_\alpha$ and $\sigma_\alpha^2$. These can be
integrated out to yield



(32) 



Therefore, 



(33) 



Similar prior specifications yield



and 



-(n+2*S-l)/2 



(34) 



(35) 



where $\bar{\beta} = \frac{1}{n} \sum_{j=1}^{n} \beta_j$ and $\bar{\gamma} = \frac{1}{n} \sum_{j=1}^{n} \gamma_j$.

In the context of the hierarchical approach (Goel, 1983; Goel & DeGroot, 1981), we can
illustrate the above specification of priors of item parameters as



(36) 



where $\pi_2(\eta)$ is viewed as

$$\pi_2(\eta) = \pi_{2,1}(\eta^{(1)} \mid \eta^{(2)})\, \pi_{2,2}(\eta^{(2)}). \tag{37}$$

It can be seen that $\eta = (\eta^{(1)}, \eta^{(2)})$. We integrate out the nuisance parameter $\eta^{(1)}$ explicitly,
assuming $\eta^{(2)}$ is known:

From the assumption of independence of the respective vectors of item parameters, 



(39) 



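For the transformed item discrimination parameter, for example,

$$\eta_\alpha = (\eta_\alpha^{(1)}, \eta_\alpha^{(2)}) = (\mu_\alpha, \sigma_\alpha^2, \nu_\alpha, \lambda_\alpha). \tag{40}$$

We can integrate out $\eta_\alpha^{(1)}$ from the prior distribution to yield

$$\int \pi_1(\boldsymbol{\alpha} \mid \eta_\alpha)\, \pi_2(\eta_\alpha)\, d\eta_\alpha^{(1)} = \pi(\boldsymbol{\alpha} \mid \eta_\alpha^{(2)}) = \pi(\boldsymbol{\alpha} \mid \nu_\alpha, \lambda_\alpha). \tag{41}$$

Posterior Distribution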

Bayesian analysis is performed by combining the prior information and the sample
information into what is called the posterior distribution; all decisions or inferences are
made about the parameter of interest from the posterior distribution. The joint posterior
density of $\boldsymbol{\theta}$ and $\boldsymbol{\xi}$, given observations $\mathbf{u}$, $\tau$, and $\eta$, is

$$\pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta) \propto l(\boldsymbol{\theta}, \boldsymbol{\xi})\, \pi(\boldsymbol{\theta} \mid \tau)\, \pi(\boldsymbol{\xi} \mid \eta). \tag{42}$$

When ignorance (i.e., noninformative) priors are assigned to the hyperparameters, $\tau$ and
$\eta$, the posterior evaluation will be based largely on the data. This will provide Stein-type
shrinkage estimates for the item and ability parameters, smoothing each of these toward
respective average values (Leonard & Novick, 1985). When the hyperparameters are assumed
to be known, the simultaneous maximization of the joint posterior results in JB1.

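In JB2, the following joint posterior distribution will be simultaneously maximized to
find the joint modal estimates:

$$\pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta^{(2)}) \propto l(\boldsymbol{\theta}, \boldsymbol{\xi})\, \pi(\boldsymbol{\theta} \mid \tau)\, \pi(\boldsymbol{\xi} \mid \eta^{(2)}). \tag{43}$$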

In the marginal Bayesian estimation context (Harwell & Baker, 1991; Mislevy, 1986),
assuming the hyperparameters are known, the examinee parameters $\theta_i$ are integrated over
their distribution to obtain the marginal posterior distribution

$$\pi(\boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta) \propto m(\boldsymbol{\xi} \mid \tau)\, \pi(\boldsymbol{\xi} \mid \eta). \tag{44}$$

Marginal Bayesian modal estimates of item parameters can be found by maximizing the
marginal posterior distribution.

When the two-stage hierarchical priors are employed in the marginal Bayesian estimation
of item parameters, assuming the ability hyperparameter $\tau$ is known, the marginal posterior
distribution can be defined as

$$\pi(\boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta^{(2)}) \propto m(\boldsymbol{\xi} \mid \tau)\, \pi(\boldsymbol{\xi} \mid \eta^{(2)}). \tag{45}$$

In the marginal Bayesian estimation procedures, ability parameters are estimated after
obtaining the item parameter estimates, assuming these are true values. Two Bayes methods
are available: Bayes modal estimation and Bayes expected a posteriori (EAP) estimation
(Bock & Mislevy, 1982).

Since detailed mathematical derivations can be found for the marginal Bayesian estimation
procedures (Harwell & Baker, 1991; Mislevy, 1986), in the next section we present only the
two joint Bayesian estimation procedures.

Joint Bayesian Estimation 

JB2 Estimation 

In order to estimate the item and ability parameters, the log posterior distribution
$\ln \pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta^{(2)})$ is to be maximized by taking partial derivatives with respect to the
parameters and setting them equal to zero. A procedure such as the Newton-Raphson
method is then used to obtain the joint modal estimators.

Since the parameters for all $n$ items and the abilities for all $N$ examinees are unknown,
we first take derivatives of the logarithm of the posterior distribution with respect to these
parameters. These are then set equal to zero and the $3n + N$ simultaneous equations solved
to obtain the Bayes modal estimates of the unknown parameters. Assuming item and ability
parameters are independent, we can obtain joint Bayes modal estimates via Birnbaum's
(1968) method.

In Birnbaum's method the item parameter estimation part and the ability parameter
estimation part are repeated iteratively until a stable set of item and ability estimates is
obtained. In the item parameter estimation part, the Newton-Raphson (Kennedy & Gentle,
1980) equation is

$$\hat{\boldsymbol{\xi}}_j^{(s)} = \hat{\boldsymbol{\xi}}_j^{(s-1)} - \{\mathbf{H}_j^{(s-1)}\}^{-1} \mathbf{f}_j^{(s-1)}, \tag{46}$$

where $s$ indexes the iteration, $\mathbf{f}_j$ is the gradient vector, and $\mathbf{H}_j$ is the Hessian matrix of the log
posterior distribution, $F = \ln \pi(\boldsymbol{\theta}, \boldsymbol{\xi} \mid \mathbf{u}, \tau, \eta^{(2)})$. The Newton-Raphson equation of the ability
parameter estimation part, for examinee $i$, is

$$\hat{\theta}_i^{(s)} = \hat{\theta}_i^{(s-1)} - \left\{\frac{\partial^2 F}{\partial \theta_i^2}\right\}^{-1} \frac{\partial F}{\partial \theta_i}, \tag{47}$$

with both derivatives evaluated at the estimates from iteration $s - 1$.



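We take a partial derivative of the log posterior distribution with respect to each item
parameter, for example $\alpha_j$, and set it to zero. The resulting equation becomes

$$\frac{\partial}{\partial \alpha_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) + \frac{\partial}{\partial \alpha_j} \ln \pi(\boldsymbol{\xi} \mid \eta^{(2)}) = 0. \tag{48}$$

Similarly, when we take a partial derivative of the log of the posterior distribution with
regard to an examinee's ability parameter, and set it to zero, the resulting equation is

$$\frac{\partial}{\partial \theta_i} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) + \frac{\partial}{\partial \theta_i} \ln \pi(\boldsymbol{\theta} \mid \tau) = 0. \tag{49}$$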

In the subsequent sections, we derive the individual elements which are needed in the
Newton-Raphson method for the joint Bayesian-2 estimation procedure.

Likelihood

Taking logarithms, the log likelihood function is

$$\ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = \sum_{i=1}^{N} \sum_{j=1}^{n} \left[u_{ij} \ln\{P_j(\theta_i)\} + (1 - u_{ij}) \ln\{Q_j(\theta_i)\}\right]. \tag{50}$$
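
The log likelihood of Equation 50 can be evaluated directly from a response matrix. The
following is an illustrative sketch only; the function name `log_likelihood` and the NumPy
implementation are assumptions and are not taken from the JBAYES program.

```python
import numpy as np

def log_likelihood(u, theta, a, b, c):
    """Evaluate ln l(theta, xi) of Equation 50 for an N x n 0/1 matrix u."""
    u = np.asarray(u, dtype=float)
    theta = np.asarray(theta, dtype=float)[:, None]                  # (N, 1)
    a, b, c = (np.asarray(x, dtype=float)[None, :] for x in (a, b, c))
    p = c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))            # P_j(theta_i)
    # log1p(-p) computes ln Q_j(theta_i) = ln(1 - P_j(theta_i)).
    return float(np.sum(u * np.log(p) + (1.0 - u) * np.log1p(-p)))
```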
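First, we need partial derivatives of $P_j(\theta_i)$ with respect to each item parameter. The partial
derivatives of $P_j(\theta_i)$ with respect to $\alpha_j$, $\beta_j$, and $\gamma_j$ are

$$\frac{\partial}{\partial \alpha_j} P_j(\theta_i) = \exp(\alpha_j)\{1 - \Psi(\gamma_j)\}(\theta_i - \beta_j) P_j^*(\theta_i) Q_j^*(\theta_i), \tag{51}$$

$$\frac{\partial}{\partial \beta_j} P_j(\theta_i) = -\exp(\alpha_j)\{1 - \Psi(\gamma_j)\} P_j^*(\theta_i) Q_j^*(\theta_i), \tag{52}$$

and

$$\frac{\partial}{\partial \gamma_j} P_j(\theta_i) = \Psi(\gamma_j)\{1 - \Psi(\gamma_j)\} Q_j^*(\theta_i), \tag{53}$$

where $\Psi(\gamma_j) = \{1 + \exp(-\gamma_j)\}^{-1} = c_j$ by Equation 14,
$P_j^*(\theta_i) = [1 + \exp\{-\exp(\alpha_j)(\theta_i - \beta_j)\}]^{-1}$, and $Q_j^*(\theta_i) = 1 - P_j^*(\theta_i)$. Using these
expressions and the relationship

$$\frac{\partial}{\partial \xi_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = \sum_{i=1}^{N} \frac{u_{ij} - P_j(\theta_i)}{P_j(\theta_i) Q_j(\theta_i)} \frac{\partial}{\partial \xi_j} P_j(\theta_i), \tag{54}$$

where $\xi_j$ stands for any one of $\alpha_j$, $\beta_j$, and $\gamma_j$,
the derivatives of the log likelihood with respect to the item parameters are

$$\frac{\partial}{\partial \alpha_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = \exp(\alpha_j)\{1 - \Psi(\gamma_j)\} \sum_{i=1}^{N} (\theta_i - \beta_j) w_{ij} \{u_{ij} - P_j(\theta_i)\}, \tag{55}$$

$$\frac{\partial}{\partial \beta_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = -\exp(\alpha_j)\{1 - \Psi(\gamma_j)\} \sum_{i=1}^{N} w_{ij} \{u_{ij} - P_j(\theta_i)\}, \tag{56}$$

and

$$\frac{\partial}{\partial \gamma_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = \Psi(\gamma_j) \sum_{i=1}^{N} \{P_j(\theta_i)\}^{-1} \{u_{ij} - P_j(\theta_i)\}, \tag{57}$$

where

$$w_{ij} = \frac{P_j^*(\theta_i) Q_j^*(\theta_i)}{P_j(\theta_i) Q_j(\theta_i)}. \tag{58}$$

The partial derivative of $P_j(\theta_i)$ with respect to $\theta_i$ is

$$\frac{\partial}{\partial \theta_i} P_j(\theta_i) = \exp(\alpha_j)\{1 - \Psi(\gamma_j)\} P_j^*(\theta_i) Q_j^*(\theta_i), \tag{59}$$

and hence the derivative of the log likelihood with respect to the ability parameter is

$$\frac{\partial}{\partial \theta_i} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) = \sum_{j=1}^{n} \exp(\alpha_j)\{1 - \Psi(\gamma_j)\} w_{ij} \{u_{ij} - P_j(\theta_i)\}. \tag{60}$$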
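
Analytic gradients of this kind are easy to get wrong, so a finite-difference check is a useful
safeguard. The sketch below codes the item-parameter derivatives in Equations 55 to 57 for a
single item, taking the weight to be $w_{ij} = P_j^* Q_j^* / (P_j Q_j)$, which is the value
implied by the chain rule applied to Equations 51 to 53. The function names are hypothetical
and the code is an illustration, not the authors' implementation.

```python
import numpy as np

def item_loglik(u_j, theta, alpha, beta, gamma):
    """Log likelihood contribution of one item over all examinees."""
    psi = 1.0 / (1.0 + np.exp(-gamma))                    # c_j = Psi(gamma_j)
    p_star = 1.0 / (1.0 + np.exp(-np.exp(alpha) * (theta - beta)))
    p = psi + (1.0 - psi) * p_star
    return float(np.sum(u_j * np.log(p) + (1.0 - u_j) * np.log1p(-p)))

def item_gradient(u_j, theta, alpha, beta, gamma):
    """Derivatives with respect to (alpha_j, beta_j, gamma_j), Equations 55 to 57."""
    psi = 1.0 / (1.0 + np.exp(-gamma))
    a = np.exp(alpha)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - beta)))
    q_star = 1.0 - p_star
    p = psi + (1.0 - psi) * p_star
    w = p_star * q_star / (p * (1.0 - p))                 # assumed weight w_ij
    resid = u_j - p
    d_alpha = a * (1.0 - psi) * np.sum((theta - beta) * w * resid)
    d_beta = -a * (1.0 - psi) * np.sum(w * resid)
    d_gamma = psi * np.sum(resid / p)
    return np.array([d_alpha, d_beta, d_gamma])
```

Comparing `item_gradient` with central differences of `item_loglik` at a few arbitrary
parameter values is a quick way to confirm that the signs and weights are consistent.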
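Second Derivatives of the Likelihood

The Newton-Raphson procedures require the second derivative of the log posterior
distribution with respect to each parameter. Following standard practice (Finney, 1971;
Rao, 1973), the expectations of the second derivatives of the log likelihood for respective
item parameters are

$$E\left\{\frac{\partial^2}{\partial \alpha_j^2} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = -\exp(2\alpha_j)\{1 - \Psi(\gamma_j)\}^2 \sum_{i=1}^{N} (\theta_i - \beta_j)^2 w_{ij} P_j^*(\theta_i) Q_j^*(\theta_i), \tag{61}$$

$$E\left\{\frac{\partial^2}{\partial \beta_j^2} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = -\exp(2\alpha_j)\{1 - \Psi(\gamma_j)\}^2 \sum_{i=1}^{N} w_{ij} P_j^*(\theta_i) Q_j^*(\theta_i), \tag{62}$$

$$E\left\{\frac{\partial^2}{\partial \gamma_j^2} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = -\{\Psi(\gamma_j)\}^2 \{1 - \Psi(\gamma_j)\} \sum_{i=1}^{N} \{P_j(\theta_i)\}^{-1} Q_j^*(\theta_i), \tag{63}$$

$$E\left\{\frac{\partial^2}{\partial \alpha_j \partial \beta_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = \exp(2\alpha_j)\{1 - \Psi(\gamma_j)\}^2 \sum_{i=1}^{N} (\theta_i - \beta_j) w_{ij} P_j^*(\theta_i) Q_j^*(\theta_i), \tag{64}$$

$$E\left\{\frac{\partial^2}{\partial \alpha_j \partial \gamma_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = -\exp(\alpha_j) \Psi(\gamma_j)\{1 - \Psi(\gamma_j)\}^2 \sum_{i=1}^{N} (\theta_i - \beta_j) w_{ij} Q_j^*(\theta_i), \tag{65}$$

$$E\left\{\frac{\partial^2}{\partial \beta_j \partial \gamma_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = \exp(\alpha_j) \Psi(\gamma_j)\{1 - \Psi(\gamma_j)\}^2 \sum_{i=1}^{N} w_{ij} Q_j^*(\theta_i). \tag{66}$$

The expectation of the second derivative of the log likelihood with respect to the ability
parameter is

$$E\left\{\frac{\partial^2}{\partial \theta_i^2} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi})\right\} = -\sum_{j=1}^{n} \exp(2\alpha_j)\{1 - \Psi(\gamma_j)\}^2 w_{ij} P_j^*(\theta_i) Q_j^*(\theta_i). \tag{67}$$

Derivatives of Priors

The logarithm of the prior of the item parameters is

$$\ln \pi(\boldsymbol{\xi} \mid \eta^{(2)}) = \ln \pi(\boldsymbol{\alpha} \mid \nu_\alpha, \lambda_\alpha) + \ln \pi(\boldsymbol{\beta} \mid \nu_\beta, \lambda_\beta) + \ln \pi(\boldsymbol{\gamma} \mid \nu_\gamma, \lambda_\gamma), \tag{68}$$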

where

$$\ln \pi(\boldsymbol{\alpha} \mid \nu_\alpha, \lambda_\alpha) \propto -\left(\frac{n + 2\nu_\alpha - 1}{2}\right) \ln\{\cdots\}. \tag{69}$$

The partial derivative of the log prior of the item parameters with respect to $\alpha_j$ is



(70) 



(71) 



and similarly,



and 



where 



and 



5 1 



5 1 



^2 ^ i+s^iifizf)! 

n + 2*/« - 1 



(72) 



(73) 

(74) 

(75) 
(76) 

(77) 



Second Derivatives of Priors 



The second derivatives of the log prior of the item parameters are 



and 



— ln^(alt'a,Aa) oc 



5j™2(a, -Q)V(n + 2i/.-l) 



d\ , , i^-i)4- 2(/?i - ft)V{n + - 1) 
--^ln7r(2iv^,Ae) oc , 

d\ , , (l-i) 3^-2(7>-7)V(n+2».,-l) 
— In t( 7 , A, ) oc . 



(78) 



(79) 



(80) 
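Since

$$\ln \pi(\boldsymbol{\theta} \mid \tau) \propto -\frac{1}{2} \sum_{i=1}^{N} \theta_i^2, \tag{81}$$

the partial derivative of the log prior distribution of ability parameters with respect to $\theta_i$ is

$$\frac{\partial}{\partial \theta_i} \ln \pi(\boldsymbol{\theta} \mid \tau) \propto -\theta_i \tag{82}$$

and the second derivative is

$$\frac{\partial^2}{\partial \theta_i^2} \ln \pi(\boldsymbol{\theta} \mid \tau) \propto -1. \tag{83}$$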
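
The ability part of the iteration can be illustrated for the special case $c_j = 0$ (the
two-parameter model), in which $w_{ij} = 1$ and $P_j^* = P_j$. The sketch below combines the
likelihood terms of Equations 60 and 67 with the prior terms of Equations 82 and 83 in a
Newton-Raphson loop for one examinee, holding item parameters fixed. The function name, the
starting value, and the stopping rule are assumptions made for illustration.

```python
import numpy as np

def map_ability(u_i, alpha, beta, theta0=0.0, tol=1e-8, max_iter=50):
    """Bayes modal ability estimate for one examinee under a N(0, 1) prior.

    u_i: 0/1 responses to n items; alpha, beta: transformed item parameters
    (alpha_j = ln a_j, beta_j = b_j), treated as known.
    """
    u_i = np.asarray(u_i, dtype=float)
    a = np.exp(np.asarray(alpha, dtype=float))
    beta = np.asarray(beta, dtype=float)
    theta = float(theta0)
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-a * (theta - beta)))
        grad = np.sum(a * (u_i - p)) - theta              # Eq. 60 plus Eq. 82
        hess = -np.sum(a ** 2 * p * (1.0 - p)) - 1.0      # Eq. 67 plus Eq. 83
        step = grad / hess
        theta -= step                                     # Newton-Raphson update
        if abs(step) < tol:
            break
    return theta
```

Because the prior contributes $-1$ to the second derivative, the denominator never vanishes,
so the estimate stays finite even for all-correct or all-incorrect response patterns.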



Initial Values for the Newton-Raphson Method 

The Newton-Raphson method typically requires close approximations to the solution as
starting points. Initial values for these starting points may be obtained from the following
equations (Baker, 1987; Swaminathan & Gifford, 1986):

= In aS^U Infill. (84) 



V 



1 - n. 



' = ^. (85) 



'J 



and 

9?^ = In ( -Zr''^' , 1 , (87) 



where $r_j$ is the biserial correlation of item $j$ and the item-excluded total score, $z_j$ is
the normal deviate $z_j = \Phi^{-1}(1 - p_j)$, $\Phi$ denotes the standard normal cumulative distribution
function, $p_j$ is the classical item difficulty (i.e., $p_j = \sum_{i=1}^{N} u_{ij} / N$), and $m_j$ is the number of
options in multiple choice item $j$.

JB1 Estimation

The difference between the two joint Bayesian estimation procedures lies in the form of the
prior distributions. Since JB1 also requires the Newton-Raphson method, we need partial
and second derivatives of the log likelihood and log prior distributions. When we take a
partial derivative of the log posterior distribution with respect to an item parameter, say $\alpha_j$,
and set it to zero, we obtain

$$\frac{\partial}{\partial \alpha_j} \ln l(\boldsymbol{\theta}, \boldsymbol{\xi}) + \frac{\partial}{\partial \alpha_j} \ln \pi(\boldsymbol{\xi} \mid \eta) = 0. \tag{88}$$

Since the partial derivative of the log likelihood function is the same as the one used in JB2
estimation, we dispense with description of the likelihood part and present the elements
for the item priors.

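Derivatives of Priors

The term $\frac{\partial}{\partial \alpha_j} \ln \pi(\boldsymbol{\xi} \mid \eta)$ represents the contribution of the item priors. The partial derivatives
of $\ln \pi(\boldsymbol{\xi} \mid \eta)$ with respect to $\alpha_j$, $\beta_j$, and $\gamma_j$ are

$$\frac{\partial}{\partial \alpha_j} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\alpha^2} (\alpha_j - \mu_\alpha), \tag{89}$$

$$\frac{\partial}{\partial \beta_j} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\beta^2} (\beta_j - \mu_\beta), \tag{90}$$

and

$$\frac{\partial}{\partial \gamma_j} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\gamma^2} (\gamma_j - \mu_\gamma). \tag{91}$$

Second Derivatives of Prior

The second derivatives of the priors for the item parameters are

$$\frac{\partial^2}{\partial \alpha_j^2} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\alpha^2}, \tag{92}$$

$$\frac{\partial^2}{\partial \beta_j^2} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\beta^2}, \tag{93}$$

and

$$\frac{\partial^2}{\partial \gamma_j^2} \ln \pi(\boldsymbol{\xi} \mid \eta) = -\frac{1}{\sigma_\gamma^2}. \tag{94}$$

Empirical Study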

In this section we present an empirical comparison of the three Bayesian methods. Data
were simulated under the following conditions: (1) number of examinees ($N$ = 100, 300), (2)
number of items ($n$ = 15, 45), (3) estimation (JB1, JB2, MB), and (4) prior condition
(prior-αL, prior-αT, prior-αβT). The sample sizes and the test lengths were selected to emulate
the situation in which estimation procedures and priors might have some impact upon item
and ability parameter estimates. The sample size and test length were completely crossed
to yield four situations.

Three Bayesian estimation procedures were used: JB1 is the joint Bayes modal estimation
procedure with known hyperparameters; JB2 is the joint Bayes modal estimation procedure
with informative hyperpriors; and MB is the marginal Bayes modal estimation of item
parameters with known hyperparameters and the EAP estimation of ability parameters.

Each estimation procedure had the three prior conditions: prior-αL, prior-αT, and
prior-αβT. The prior-αL condition used a loose prior for the transformed item discrimination;
the prior-αT condition used a tight prior for the transformed item discrimination; and the
prior-αβT condition used tight priors for both the transformed item discrimination and the
item difficulty. The exact specification of the prior condition is presented in a subsequent
section on the item and ability parameter estimation.

Data Generation 

Using the two-parameter logistic model,

$$P_j(\theta_i) = \left[1 + \exp\{-a_j(\theta_i - b_j)\}\right]^{-1}, \tag{95}$$

dichotomous item response vectors were generated via the computer program GENIRV
(Baker, 1982). Based on the usual ranges of item parameters for the two-parameter
logistic model, the underlying item discrimination parameters were assumed to be normally
distributed with mean 1.046 and variance 0.103, $a_j \sim N(1.046, 0.103)$; that is,
$\alpha_j \sim N(0.0, 0.09)$. The underlying item difficulty parameters were distributed normally with mean
0.0 and variance 1.0, $b_j \sim N(0, 1)$.

For data generation purposes, an approximation based on histograms was adopted. Item
discrimination and item difficulty parameters for the 15-item test were set to have three
different values, respectively. For the 45-item test, each of the item parameters was set to
have five different values. Item parameters used to generate the data sets are given in Table
1 and Table 2 for the 15-item test and for the 45-item test, respectively.

Insert Tables 1 and 2 about here 

The underlying ability parameters were matched to the item difficulty distribution.
Hence, a normal distribution with mean 0.0 and variance 1.0, $\theta_i \sim N(0, 1)$, was used to
specify the underlying ability parameters. Table 3 shows the ability groups and the number
of examinees in each ability group for samples of 100 and 300.

Insert Table 3 about here 

For each of the factors of sample size and test length, four replications of the simulated 
data were generated. Since the two factors were completely crossed, a total of 16 GENIRV 
runs was needed to obtain the data sets for the empirical comparison. 
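
A data set of this general kind can be generated with a few lines of code. The sketch below is
not GENIRV and does not reproduce the histogram approximation or the fixed parameter values
of Tables 1 to 3; it is an assumption-laden illustration that draws parameters at random from
the stated distributions, with a hypothetical function name and seed argument.

```python
import numpy as np

def simulate_2pl(n_examinees, n_items, seed=0):
    """Draw a 0/1 response matrix from the two-parameter model of Equation 95."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, np.sqrt(0.09), n_items)   # alpha_j ~ N(0, 0.09)
    a = np.exp(alpha)                                 # discrimination a_j > 0
    b = rng.normal(0.0, 1.0, n_items)                 # b_j ~ N(0, 1)
    theta = rng.normal(0.0, 1.0, n_examinees)         # theta_i ~ N(0, 1)
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    u = (rng.random(p.shape) < p).astype(int)         # Bernoulli(P_j(theta_i))
    return u, theta, a, b
```

For example, `simulate_2pl(100, 15)` and `simulate_2pl(300, 45)` correspond to the smallest and
largest of the four crossed conditions.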

Item and Ability Parameter Estimation 

Each of the generated data sets was analyzed via the computer program BILOG (Mislevy
& Bock, 1989) for the marginal Bayesian estimation and via the computer program
JBAYES, specifically developed for this study to provide the joint Bayesian estimates. In
each estimation procedure, three prior conditions, prior-αL, prior-αT, and prior-αβT, were
employed. Hence, for example, the generated item response data set for the first replication
of sample size 100 and test length 15 was analyzed by nine computer runs (three estimation
procedures with three prior conditions).

The default options of the computer program BILOG (Mislevy & Bock, 1989) provide
the marginal Bayesian modal estimates of item parameters and the expected a posteriori
estimates of ability parameters for the two-parameter model. In the prior-αL condition for
MB, a lognormal prior with mean 0.0 and variance 0.25 was used, that is, $\ln a_j \sim N(0, 0.25)$.
This is, in fact, the default prior specification in BILOG for the two-parameter model.
In the prior-αT condition, a lognormal distribution with mean 0.0 and variance 0.09,
$\ln a_j \sim N(0, 0.09)$, was used. For the prior-αβT condition, the same prior as in the
prior-αT condition was used, along with a normal prior for the item difficulty with mean 0.0 and
variance 1.0, $b_j \sim N(0, 1)$.

For JB1 estimation via JBAYES, $\alpha_j \sim N(0, 0.25)$ was used for the prior-αL condition.
For the prior-αT condition, $\alpha_j \sim N(0, 0.09)$ was used. The prior-αβT condition used $\alpha_j \sim N(0, 0.09)$
and $\beta_j \sim N(0, 1)$. For JB2 estimation, the mean hyperparameter was assumed to have a
noninformative uniform distribution and the variance hyperparameter was set to have an
inverse gamma distribution. In the prior-αL condition, the inverse gamma distribution with
$\nu_\alpha = 4$ and $\lambda_\alpha = 1$ was used for the variance hyperparameter of the transformed item
discrimination parameters: $\sigma_\alpha^2 \sim IG(4, 1)$. The inverse gamma distribution with parameters
$\nu_\alpha = 11$ and $\lambda_\alpha = 1$ was used in the prior-αT condition: $\sigma_\alpha^2 \sim IG(11, 1)$. Two inverse
gamma distributions with parameters $\nu_\alpha = 11$ and $\lambda_\alpha = 1$, and $\nu_\beta = 4$ and $\lambda_\beta = 0.25$ for the
variance hyperparameters of the transformed item discrimination and of the item difficulty,
respectively, were adopted for the prior-αβT condition: $\sigma_\alpha^2 \sim IG(11, 1)$ and $\sigma_\beta^2 \sim IG(4, 0.25)$.

When the mean hyperparameter is assumed to have a fixed value, $\mu$, then the specification
of the variance hyperparameter by the inverse gamma distribution with parameters $\nu$ and
$\lambda$, $IG(\nu, \lambda)$, yields the parameter of interest which is distributed as a t with
mean $\mu$, variance $1/(\nu\lambda)$, and degrees of freedom $2\nu$, that is,
$T(2\nu, \mu, 1/(\nu\lambda))$ (Berger, 1985). Therefore, for the transformed item discrimination,
assuming the mean hyperparameter $\mu_\alpha$ has a fixed value, specification of the
hyperparameter of variance by the inverse gamma with $\nu_\alpha = 4$ and $\lambda_\alpha = 1$
yields a transformed item discrimination parameter which is distributed as a t with mean
$\mu_\alpha$, variance $1/(\nu_\alpha\lambda_\alpha) = 0.25$, and degrees of freedom
$2\nu_\alpha = 8$, that is, $\alpha_j \sim T(8, \mu_\alpha, 0.25)$. Similarly, the specification
$\sigma_\alpha^2 \sim IG(11, 1)$ implies $\alpha_j \sim T(22, \mu_\alpha, 0.09)$; and the
specification $\sigma_\beta^2 \sim IG(4, 0.25)$ yields $\beta_j \sim T(8, \mu_\beta, 1)$. The above
illustration fixes the mean hyperparameter; because we assumed a noninformative prior for the
mean hyperparameter, the specifications used in JB2 will not produce the same specifications of
item hyperparameters used in the marginal Bayesian and the joint Bayesian-1 procedures. These
specifications are $\sigma_\alpha^2 \sim IG(4, 1)$, $\sigma_\alpha^2 \sim IG(11, 1)$, and
$\sigma_\beta^2 \sim IG(4, 0.25)$ and are similar to their counterparts in the MB and JB1
estimation procedures.
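
The correspondence between the inverse gamma specifications and the implied t distributions can be checked numerically. The sketch below is illustrative only. It assumes the parameterization in which the reciprocal of an inverse gamma variate with parameters (ν, λ) is gamma distributed with shape ν and scale λ, and it treats the third argument of T as the squared scale of the t distribution, so that the variance of the draws is that value times df / (df − 2). The function names are hypothetical.

```python
import random
import statistics


def implied_t(nu, lam):
    """Marginal t specification implied by an inverse gamma variance prior.

    Returns (degrees of freedom, squared scale) = (2 * nu, 1 / (nu * lam)),
    assuming the mean hyperparameter is fixed.
    """
    return 2 * nu, 1.0 / (nu * lam)


def draw_marginal(nu, lam, mu, n, seed=0):
    """Draw from the marginal by mixing normals over an inverse gamma variance."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # Assumed parameterization: 1 / sigma^2 is gamma with shape nu, scale lam.
        precision = rng.gammavariate(nu, lam)
        draws.append(rng.gauss(mu, (1.0 / precision) ** 0.5))
    return draws


# The three (nu, lambda) pairs used for the JB2 prior conditions.
for nu, lam in [(4, 1.0), (11, 1.0), (4, 0.25)]:
    df, scale2 = implied_t(nu, lam)
    print(nu, lam, df, round(scale2, 2))

# Simulation check for IG(4, 1): compare sample and theoretical variance.
df, scale2 = implied_t(4, 1.0)
sample = draw_marginal(4, 1.0, mu=0.0, n=100_000)
print(round(statistics.pvariance(sample), 3), round(scale2 * df / (df - 2), 3))
```

The closed-form function reproduces the values 0.25, 0.09 (rounded from 1/11), and 1 quoted in the text; the simulation only confirms that the mixture behaves like the corresponding t distribution.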

The EAP estimation was used in MB for the ability estimation via BILOG. Bayes modal 
estimation was employed in the ability estimation for both joint Bayesian procedures via 
JBAYES. All three Bayesian estimation procedures used a standard normal distribution as 
the prior for the ability parameters. 
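
The difference between the two ability estimators can be sketched on a grid of ability values. This is an illustration, not the algorithm used by BILOG or JBAYES: item parameters are treated as known, the grid limits and spacing are arbitrary choices, and the function names and example items are hypothetical. The posterior mean (EAP) and the posterior mode (Bayes modal) coincide when the posterior is symmetric and differ when it is skewed.

```python
import math

GRID = [i / 100.0 for i in range(-400, 401)]  # theta from -4 to 4 in steps of 0.01


def likelihood(theta, responses, items):
    """2PL likelihood of a 0/1 response pattern at ability theta."""
    value = 1.0
    for u, (a, b) in zip(responses, items):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        value *= p if u == 1 else 1.0 - p
    return value


def posterior_weights(responses, items):
    """Unnormalized posterior (standard normal prior x likelihood) on the grid."""
    return [math.exp(-0.5 * t * t) * likelihood(t, responses, items) for t in GRID]


def eap(responses, items):
    """Expected a posteriori estimate: posterior mean over the grid."""
    w = posterior_weights(responses, items)
    return sum(t * wi for t, wi in zip(GRID, w)) / sum(w)


def bayes_modal(responses, items):
    """Bayes modal estimate: grid point with the highest posterior density."""
    w = posterior_weights(responses, items)
    return GRID[max(range(len(GRID)), key=w.__getitem__)]


# Hypothetical five-item test and an all-correct response pattern.
items = [(0.66, -1.38), (1.00, 0.00), (1.00, 0.00), (1.00, 1.38), (1.51, 0.00)]
all_correct = [1, 1, 1, 1, 1]
print(round(eap(all_correct, items), 2), bayes_modal(all_correct, items))
```

Both estimates remain finite for the all-correct pattern because the standard normal prior bounds the posterior, which is the property that motivates Bayesian ability estimation in this setting.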

Metric Transformation 

In parameter recovery studies, such as the present one, comparisons between two or more
sets of estimates and the underlying parameters require that the item and ability estimates
obtained from different calibration runs and their parameters be placed on a common metric
(Baker & Al-Karni, 1991; Yen, 1987). Parameter estimation procedures under IRT yield
metrics which are unique up to a linear transformation. To link both sets of estimates and
parameters, it is necessary to determine the slope and intercept of the equating coefficients
required for the transformation. The estimates of the item and ability parameters for each
of the estimation procedures were placed on the scale of the true parameters using the test
characteristic curve method of Stocking and Lord (1983) as implemented in the computer
program EQUATE (Baker, 1990).
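
The idea behind the test characteristic curve criterion can be sketched as follows. This is not the EQUATE implementation: a coarse grid search stands in for numerical optimization, the evaluation points and example item parameters are hypothetical, and the logistic function is assumed to carry no scaling constant. Under the transformation θ* = Aθ + B, discriminations are divided by A and difficulties become Ab + B.

```python
import math


def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at theta (2PL)."""
    return sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for a, b in items)


def rescale_items(items, slope, intercept):
    """Move item parameters to the metric theta* = slope * theta + intercept."""
    return [(a / slope, slope * b + intercept) for a, b in items]


def tcc_loss(slope, intercept, est_items, true_items, thetas):
    """Mean squared gap between the true TCC and the rescaled estimated TCC."""
    moved = rescale_items(est_items, slope, intercept)
    return sum((tcc(t, true_items) - tcc(t, moved)) ** 2 for t in thetas) / len(thetas)


def grid_search(est_items, true_items, thetas):
    """Coarse grid search for slope and intercept (illustration only)."""
    candidates = [(s / 20.0, i / 20.0) for s in range(10, 41) for i in range(-20, 21)]
    return min(candidates,
               key=lambda c: tcc_loss(c[0], c[1], est_items, true_items, thetas))


# Hypothetical true parameters (a, b) and error-free "estimates" on another metric.
true_items = [(0.66, -1.38), (1.00, 0.00), (1.00, 1.38), (1.51, 0.00)]
A, B = 1.2, -0.3
est_items = [(a * A, (b - B) / A) for a, b in true_items]
thetas = [k / 2.0 for k in range(-5, 6)]  # -2.5, -2.0, ..., 2.5

best = grid_search(est_items, true_items, thetas)
print(best, tcc_loss(best[0], best[1], est_items, true_items, thetas))
```

With error-free estimates the search recovers the slope and intercept that generated them; with real estimates the minimum of the criterion is positive and defines the equating coefficients.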

Criteria 

The empirical comparisons in this study involved three criteria: root mean square difference
(RMSD), correlation, and bias. RMSD is the square root of the average of the squared
differences between estimated and true values. For item discrimination, for example, RMSD
is defined as

$$\mathrm{RMSD}_a = \sqrt{\frac{1}{n} \sum_{j=1}^{n} (\hat{a}_j - a_j)^2},$$

where $n$ is the number of items.

The bias of a point estimator $\hat{a}$ is given by $B_a = E(\hat{a}) - a$; the bias for item
difficulty is given by $B_b = E(\hat{b}) - b$; and the bias for the ability estimator is defined by
$B_\theta = E(\hat{\theta}) - \theta$ (Mendenhall, Scheaffer, & Wackerly, 1981). For the 15-item
test, $B_a$ (or $B_b$) was obtained with regard to the three different underlying parameters across
the four replications. For the 45-item test, $B_a$ (or $B_b$) was calculated with regard to the
five different underlying parameters across the four replications. The bias $B_\theta$ was obtained
for the 11 ability levels over the four replications.
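
The three criteria can be written compactly. The sketch below is illustrative; the function names and the example estimates are hypothetical. Bias is computed by pooling all estimates that share the same true value, which mirrors the way estimates were combined across replications.

```python
import math


def rmsd(estimates, truths):
    """Square root of the mean squared difference between estimates and truths."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))


def correlation(estimates, truths):
    """Pearson correlation between estimates and true values."""
    n = len(truths)
    mean_e, mean_t = sum(estimates) / n, sum(truths) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(estimates, truths))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_t = sum((t - mean_t) ** 2 for t in truths)
    return cov / math.sqrt(var_e * var_t)


def bias_by_level(estimates, truths):
    """Mean estimate minus true value, pooled over all items sharing that value."""
    pooled = {}
    for e, t in zip(estimates, truths):
        pooled.setdefault(t, []).append(e)
    return {t: sum(es) / len(es) - t for t, es in pooled.items()}


# Hypothetical discrimination estimates for items at three true values.
truths = [0.66, 0.66, 1.00, 1.00, 1.51, 1.51]
estimates = [0.80, 0.76, 1.02, 0.98, 1.40, 1.30]

print(round(rmsd(estimates, truths), 3))
print(round(correlation(estimates, truths), 3))
print({t: round(b, 2) for t, b in bias_by_level(estimates, truths).items()})
```

In this made-up example the low discriminations are overestimated and the high ones underestimated, the shrinkage pattern that the bias criterion is designed to expose and that RMSD and correlation alone would not separate by direction.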

Results 
RMSD and Correlation Results 

RMSD and Correlation Results for Item Discrimination. RMSDs of item discriminations for 
each data set are reported in Table 4. As sample size increased, RMSDs decreased; marginal 
RMSD means were 0.24924 and 0.20646 for sample sizes 100 and 300, respectively. 

Insert Tables 4 and 5 about here

MB yielded smaller RMSDs than either of the two joint Bayesian procedures. Of the
two joint Bayesian procedures, JB1 yielded larger RMSDs. Increasing the number of items
reduced the size of RMSDs, particularly for JB1 and JB2. For the 15-item test, MB yielded
smaller RMSD values although all three estimation methods produced nearly the same values
for the 45-item test. RMSDs for the third replication of the sample size 100 and 15-item
test were slightly smaller than for the other cases and RMSDs for the fourth replication of
the sample size 100 and 15-item test were slightly larger than for the other cases. These
differences were probably due to sampling fluctuations in the data generation procedures
used in this study. The effect of this probable sampling fluctuation could also be seen for
the respective correlations in Table 5.

When the loose prior was used in JB1 and JB2, it yielded comparatively larger values of 
RMSD than did either of the tight prior conditions. This was particularly the case for the 
short 15-item test. 

The correlations between true and estimated values of item discriminations are given in 
Table 5. For each data set, the three Bayesian estimation procedures yielded practically 
the same correlations. Generally, the larger the sample size, the higher the correlations. Also,
increasing the number of items tended to produce slightly higher correlations. Across the three
prior conditions used, no definitive tendency was observed in the correlations.

RMSD and Correlation Results for Item Difficulty. Table 6 contains RMSDs for item 
difficulty. The pattern of results was nearly the same as that for item discrimination. An 
increase in sample size appeared to be associated with a decrease in the size of RMSDs. For 
JB1 and JB2, increasing the number of items appeared to slightly decrease RMSDs. The 
values of RMSD from MB were nearly the same regardless of the test size. MB consistently 
yielded the smallest RMSDs. 

The prior-αβT condition yielded relatively smaller RMSDs than did either the prior-αL or
prior-αT conditions. MB consistently yielded smaller RMSDs than JB1 and JB2 regardless
of the prior condition employed.

Insert Tables 6 and 7 about here

For each data set, the three estimation procedures yielded nearly the same correlations 
between estimates and parameters (see Table 7). Generally, the larger sample sizes yielded 
higher correlations. Increasing the number of items tended to produce slightly higher
correlations. There seemed to be no definitive trends in the correlations among the three
prior conditions. 

RMSD and Correlation Results for Ability. The RMSD results between ability estimates 
and the underlying parameters are reported in Table 8. As expected, RMSD values were 
much smaller for the 45-item test than for the 15-item test. Smaller values were consistently 
obtained for MB than for either of the two joint Bayesian procedures. The differences 
between MB and either of the two joint Bayesian procedures were particularly noticeable 
with the short test. As the number of items increased, the difference in RMSDs among the 
three estimation procedures appeared to decrease. 

Insert Tables 8 and 9 about here 

Prior conditions did not have an apparent impact on the size of RMSD values for ability. 
This might be expected as the prior conditions used were manipulated only with respect to 
item parameters. 

The correlations between the ability estimates and the true values are reported in Table 
9. The correlations were nearly identical across the three estimation procedures for each data 
set. The 45-item test yielded higher correlations than the 15-item test. The prior conditions 
did not seem to affect the correlations between the ability estimates and the underlying
parameters. As was the case with RMSD results, the prior used in the context of the item
parameter estimation had minimal effect when estimating ability parameters.




Bias Results 

Bias Results for Item Discrimination. The bias results for item discrimination, presented in
Table 10, appear to reflect influence by a number of factors. Each bias statistic was obtained
by combining all four replications together; that is, the numbers of items used to obtain bias
values were 16, 28, and 16, for a = 0.66, 1.00, and 1.51, respectively, for the 15-item test.
For the 45-item test, 16, 36, 76, 36, and 16 items were used for a = 0.57, 0.76, 1.00, 1.32,
and 1.77, respectively.

Insert Table 10 about here 

For each test length, increasing the sample size resulted in a decrease in bias values. In 
general, positive bias values were observed for the smaller item discrimination parameters
(i.e., a = 0.66 for the 15-item test, and a = 0.57 and 0.76 for the 45-item test) due to the
regression toward the mean of the prior distribution. Negative values of bias were obtained
for the relatively larger item discrimination parameters (i.e., a = 1.51 for the 15-item test,
and a = 1.32 and 1.77 for the 45-item test). This shrinkage effect can be observed for all data
sets except when the loose prior on item discrimination (prior-αL) was used for the 15-item
test. When a large sample size was used with the 45-item test, all three estimation procedures
yielded similar results. 

For the three different levels of item discrimination, both JB1 and JB2 produced more
positive bias for the 15-item test than did MB. The two tight prior conditions, prior-αT and
prior-αβT, yielded similar patterns of bias for all data sets.

Bias Results for Item Difficulty. The bias results for item difficulty are reported in Table
11. The pattern of results was somewhat different from that for item discrimination. For
the 15-item test, the two joint Bayesian methods yielded negative bias values for the easy
items (b = -1.38) and positive bias values for the difficult items (b = 1.38). When both
priors on item difficulty and item discrimination were used, the same pattern was observed.
Even though the test size and sample size increased, the same pattern was observed for the three
methods of estimation. MB yielded the smallest bias for all item difficulty levels in all data
sets.

Insert Table 11 about here

Bias Results for Ability. The bias results for ability from the 100-examinee-15-item data
set are presented in Table 12. Those for the 100-examinee-45-item, 300-examinee-15-item,
and 300-examinee-45-item data sets are presented in Tables 13, 14, and 15, respectively. It
can be seen from these tables that shrinkage was more evident when a small number of items
was used. The prior conditions employed in item parameter estimation did not produce any
difference among the bias results. The expected a posteriori estimation of ability employed
in MB yielded consistently larger sizes of bias than the Bayes modal method used in the two
joint Bayesian methods. JB1 and JB2 yielded nearly the same pattern of bias for all data
sets. JB2 yielded relatively smaller values of bias, however, than the other two methods.
It should be noted that the bias values for the different ability levels were obtained by 
combining the four replications. 

Insert Tables 12, 13, 14, and 15 about here 

Discussion 

Maximum likelihood approaches in IRT suffer from a number of problems, an important
one being the possibility that unreasonable values will be obtained for parameter estimates,
particularly for item discrimination and pseudo-guessing. In addition, these approaches
perform poorly when estimating item and ability parameters for unusual response patterns
such as all correct or all incorrect answers. These problems have led to interest in the
development of Bayesian approaches for estimation of item and ability parameters. In the
present study, we used a recovery study approach to compare parameter estimates obtained
via a marginal Bayesian algorithm, MB, and two joint Bayesian algorithms, JB1 and JB2.

Analysis of item parameter recovery results indicated that MB yielded parameter
estimates which were generally better than those obtained from JB1 or JB2. RMSD and
bias results for item discrimination and difficulty were smaller for MB estimates. JB1 and
JB2 estimates were similar although JB2 results were slightly better. These differences were
primarily evident in the small sample and short test conditions. The superiority of MB was likely
due to the fact that MB permits item parameters to be estimated without the concurrent
need to estimate ability. Differences due to sample size are interesting if only for the fact
that the two sample sizes simulated in the present study, 100 and 300 examinees, were
both relatively small. In reality, all three Bayesian methods performed well, yielding item
parameter estimates which were not markedly different from the underlying values. Failure
of the joint Bayesian methods to provide estimates as accurate as MB under these conditions
should not be viewed as something that indicates a serious deficiency of the joint Bayesian
methods. Rather, what these results suggest is that marginalized Bayesian solutions are
relatively powerful under the somewhat extreme conditions simulated in the present study.

The EAP ability estimates obtained via MB were more accurate in terms of RMSD than
those from either of the two joint methods. The bias values for EAP estimates, however,
were larger than those for the Bayes modal estimates of ability for JB1 and JB2. This is a
well-known result and demonstrates the impact of the use of the posterior mean in the EAP
estimation rather than the posterior mode (Bock & Mislevy, 1982).

The effectiveness of the marginalization in MB may depend in part on the accuracy of 
the ability hyperparameters. Seong (1991) has shown that item parameter estimates from
the marginalized distribution are sensitive to misspecification of the ability distributions. In
this study the generated abilities had a standard normal distribution. Consequently, the
marginalization of the posterior distribution was performed under an optimal situation.

Both the shape and the variance of the prior distribution play a part in the estimation 
of parameters. The more informative the prior (i.e., the smaller the variance), the more
the parameter estimate tends to be pulled toward the mean of the prior. The tight prior
conditions used in the present study, prior-αT and prior-αβT, yielded better item parameter
estimates than did the loose prior, prior-αL. The use of tight priors seems appropriate
when there is strong a priori information about the parameters. In the MB context, the
misspecification of prior information has not been found to be a serious problem except
when the mean of the underlying item discrimination parameters was much smaller than the
mean of the prior (Al-Karni, 1990). 
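
The pull toward the prior mean can be illustrated with the simplest conjugate case. The sketch below assumes a normal likelihood combined with a normal prior, which is a stand-in for, not a reproduction of, the posterior under the two-parameter logistic model. The prior variances 0.25 and 0.09 are those of the loose and tight conditions on the log discrimination; the sampling variance and the function name are hypothetical.

```python
import math


def posterior_mean(estimate, sampling_var, prior_mean, prior_var):
    """Posterior mean for a normal likelihood combined with a normal prior."""
    weight = prior_var / (prior_var + sampling_var)  # weight given to the data
    return weight * estimate + (1.0 - weight) * prior_mean


raw = math.log(1.51)   # a large discrimination expressed on the log scale
sampling_var = 0.10    # hypothetical sampling variance of the raw estimate

loose = posterior_mean(raw, sampling_var, prior_mean=0.0, prior_var=0.25)
tight = posterior_mean(raw, sampling_var, prior_mean=0.0, prior_var=0.09)
print(round(raw, 3), round(loose, 3), round(tight, 3))
```

The tight prior moves the estimate further toward the prior mean of zero than the loose prior does, and the gap between the two narrows as the sampling variance shrinks, that is, as more data become available.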

Incorrect specification of the prior may result in more serious consequences for JB1 and 
MB than for JB2. This condition was not tested in the present study because priors were 
relatively well-matched to the generated data sets. 

Several issues remain to be studied in the present context. In particular, little has been
done on the shrinkage effect except for Al-Karni (1990) and Gifford and Swaminathan (1990).
Neither are the effects of priors well known with respect to the robustness of two-stage
hierarchical models. This kind of research is particularly valuable for small samples and
short tests. Marginal Bayesian estimation was arguably the more desirable algorithm in
the present study. Even so, it remains to be seen whether incorporation of a two-stage
hierarchical procedure might improve marginal Bayes modal estimates.

Table 1: Item Discrimination and Item Difficulty Parameters for 15-Item Test

Item    Discrimination*    Difficulty

  1      0.66 (-0.41)        -1.38
  2      0.66 (-0.41)         0.00
  3      0.66 (-0.41)         0.00
  4      0.66 (-0.41)         1.38
  5      1.00 (0.00)         -1.38
  6      1.00 (0.00)         -1.38
  7      1.00 (0.00)          0.00
  8      1.00 (0.00)          0.00
  9      1.00 (0.00)          0.00
 10      1.00 (0.00)          1.38
 11      1.00 (0.00)          1.38
 12      1.51 (0.41)         -1.38
 13      1.51 (0.41)          0.00
 14      1.51 (0.41)          0.00
 15      1.51 (0.41)          1.38

* Parentheses contain the transformed item discrimination.

Table 2: Item Discrimination and Item Difficulty Parameters for 45-Item Test

Item     Discrimination*    Difficulty

  1       0.57 (-0.57)        -0.95
  2-3     0.57 (-0.57)         0.00
  4       0.57 (-0.57)         0.95
  5       0.76 (-0.28)        -1.90
  6-7     0.76 (-0.28)        -0.95
  8-10    0.76 (-0.28)         0.00
 11-12    0.76 (-0.28)         0.95
 13       0.76 (-0.28)         1.90
 14-15    1.00 (0.00)         -1.90
 16-18    1.00 (0.00)         -0.95
 19-27    1.00 (0.00)          0.00
 28-30    1.00 (0.00)          0.95
 31-32    1.00 (0.00)          1.90
 33       1.32 (0.28)         -1.90
 34-35    1.32 (0.28)         -0.95
 36-38    1.32 (0.28)          0.00
 39-40    1.32 (0.28)          0.95
 41       1.32 (0.28)          1.90
 42       1.77 (0.57)         -0.95
 43-44    1.77 (0.57)          0.00
 45       1.77 (0.57)          0.95

* Parentheses contain the transformed item discrimination.

Table 3: Number of Examinees at Each of the 11 Ability Levels

              Number of Examinees
θ Level      N = 100      N = 300

 -2.5            1            4
 -2.0            3            8
 -1.5            7           20
 -1.0           12           36
 -0.5           17           52
  0.0           20           60
  0.5           17           52
  1.0           12           36
  1.5            7           20
  2.0            3            8
  2.5            1            4

Table 4: Root Mean Square Differences of Item Discrimination

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.315  0.254  0.251    0.263  0.250  0.253    0.272  0.251  0.258
           2   0.282  0.231  0.211    0.245  0.229  0.214    0.253  0.208  0.225
           3   0.349  0.217  0.194    0.285  0.225  0.215    0.181  0.158  0.175
           4   0.332  0.337  0.295    0.301  0.294  0.291    0.313  0.290  0.296

      45   1   0.270  0.233  0.241    0.234  0.240  0.239    0.261  0.227  0.233
           2   0.241  0.233  0.233    0.239  0.249  0.250    0.252  0.240  0.249
           3   0.313  0.264  0.261    0.261  0.264  0.263    0.299  0.259  0.266
           4   0.225  0.209  0.206    0.215  0.228  0.228    0.206  0.197  0.204

300   15   1   0.204  0.195  0.188    0.199  0.199  0.195    0.152  0.160  0.167
           2   0.329  0.184  0.174    0.195  0.179  0.173    0.178  0.169  0.176
           3   0.595  0.288  0.277    0.533  0.291  0.281    0.277  0.211  0.209
           4   0.755  0.231  0.228    0.260  0.229  0.228    0.212  0.191  0.189

      45   1   0.155  0.137  0.134    0.138  0.136  0.137    0.151  0.132  0.134
           2   0.203  0.189  0.183    0.188  0.180  0.181    0.199  0.182  0.182
           3   0.166  0.152  0.151    0.152  0.153  0.153    0.164  0.151  0.153
           4   0.206  0.179  0.172    0.178  0.171  0.169    0.208  0.171  0.174

* Number of Examinees (N), Number of Items (n), and Replication (r).

Table 5: Correlations Between Estimates and Parameters for Item Discrimination

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.615  0.612  0.614    0.616  0.610  0.592    0.618  0.621  0.612
           2   0.770  0.745  0.772    0.771  0.740  0.774    0.748  0.748  0.703
           3   0.809  0.826  0.852    0.817  0.828  0.840    0.879  0.893  0.862
           4   0.383  0.362  0.385    0.389  0.372  0.388    0.423  0.423  0.400

      45   1   0.695  0.677  0.691    0.677  0.687  0.698    0.700  0.695  0.683
           2   0.674  0.678  0.685    0.679  0.676  0.683    0.665  0.659  0.641
           3   0.526  0.559  0.564    0.559  0.569  0.572    0.566  0.577  0.562
           4   0.742  0.752  0.765    0.757  0.773  0.771    0.783  0.796  0.796

300   15   1   0.869  0.865  0.874    0.870  0.865  0.872    0.878  0.872  0.857
           2   0.865  0.860  0.870    0.860  0.869  0.874    0.846  0.845  0.831
           3   0.688  0.761  0.761    0.701  0.760  0.761    0.766  0.776  0.780
           4   0.574  0.767  0.766    0.758  0.765  0.750    0.784  0.798  0.804

      45   1   0.906  0.905  0.909    0.906  0.906  0.908    0.906  0.908  0.906
           2   0.821  0.813  0.819    0.815  0.822  0.820    0.817  0.820  0.819
           3   0.879  0.878  0.880    0.878  0.880  0.882    0.877  0.878  0.876
           4   0.843  0.843  0.848    0.845  0.845  0.848    0.850  0.854  0.850

* Number of Examinees (N), Number of Items (n), and Replication (r).

Table 6: Root Mean Square Differences of Item Difficulty

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.374  0.360  0.344    0.374  0.375  0.356    0.325  0.310  0.308
           2   0.379  0.361  0.327    0.381  0.389  0.359    0.259  0.263  0.248
           3   0.481  0.499  0.462    0.498  0.522  0.496    0.382  0.402  0.388
           4   0.346  0.337  0.310    0.344  0.355  0.333    0.292  0.290  0.286

      45   1   0.370  0.346  0.313    0.349  0.342  0.319    0.335  0.312  0.294
           2   0.314  0.306  0.298    0.316  0.319  0.308    0.304  0.301  0.299
           3   0.314  0.303  0.251    0.308  0.310  0.272    0.269  0.260  0.246
           4   0.330  0.308  0.274    0.314  0.315  0.289    0.282  0.276  0.272

300   15   1   0.347  0.334  0.320    0.345  0.343  0.333    0.167  0.170  0.165
           2   0.330  0.301  0.283    0.316  0.304  0.292    0.172  0.174  0.174
           3   0.344  0.329  0.295    0.343  0.330  0.305    0.222  0.188  0.186
           4   0.213  0.203  0.192    0.222  0.211  0.198    0.133  0.121  0.120

      45   1   0.157  0.189  0.174    0.192  0.193  0.180    0.158  0.153  0.152
           2   0.226  0.209  0.197    0.213  0.208  0.198    0.208  0.188  0.184
           3   0.262  0.255  0.236    0.257  0.257  0.242    0.232  0.228  0.228
           4   0.227  0.215  0.194    0.219  0.220  0.203    0.189  0.174  0.171

* Number of Examinees (N), Number of Items (n), and Replication (r).



Table 7: Correlations Between Estimates and Parameters for Item Difficulty

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.946  0.950  0.951    0.950  0.951  0.953    0.946  0.951  0.952
           2   0.976  0.975  0.976    0.977  0.974  0.976    0.976  0.975  0.973
           3   0.937  0.935  0.937    0.937  0.934  0.936    0.939  0.936  0.935
           4   0.955  0.957  0.961    0.960  0.960  0.961    0.957  0.959  0.959

      45   1   0.948  0.955  0.958    0.958  0.961  0.962    0.949  0.957  0.959
           2   0.953  0.955  0.956    0.955  0.955  0.955    0.953  0.955  0.955
           3   0.970  0.972  0.976    0.972  0.972  0.975    0.970  0.972  0.973
           4   0.956  0.960  0.964    0.961  0.961  0.963    0.961  0.963  0.963

300   15   1   0.993  0.993  0.993    0.993  0.992  0.992    0.993  0.993  0.992
           2   0.985  0.987  0.988    0.987  0.987  0.988    0.988  0.987  0.986
           3   0.968  0.976  0.979    0.970  0.976  0.979    0.975  0.982  0.983
           4   0.989  0.993  0.993    0.992  0.993  0.994    0.992  0.994  0.994

      45   1   0.988  0.988  0.989    0.988  0.988  0.989    0.988  0.989  0.989
           2   0.980  0.983  0.984    0.983  0.984  0.984    0.979  0.983  0.983
           3   0.972  0.974  0.976    0.974  0.974  0.976    0.973  0.974  0.974
           4   0.985  0.986  0.987    0.986  0.986  0.987    0.984  0.986  0.986

* Number of Examinees (N), Number of Items (n), and Replication (r).

Table 8: Root Mean Square Difference of Ability

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.501  0.492  0.501    0.514  0.512  0.520    0.456  0.455  0.454
           2   0.534  0.522  0.532    0.542  0.533  0.547    0.497  0.491  0.491
           3   0.570  0.554  0.565    0.572  0.565  0.572    0.537  0.536  0.536
           4   0.609  0.582  0.589    0.616  0.603  0.610    0.546  0.538  0.538

      45   1   0.316  0.327  0.3?4    0.334  0.336  0.344    0.315  0.319  0.320
           2   0.352  0.350  0.360    0.362  0.360  0.376    0.342  0.341  0.341
           3   0.310  0.308  0.314    0.310  0.311  0.317    0.314  0.310  0.309
           4   0.298  0.302  0.308    0.309  0.308  0.317    0.286  0.291  0.292

300   15   1   0.552  0.549  0.554    0.556  0.555  0.558    0.521  0.517  0.517
           2   0.552  0.551  0.556    0.557  0.559  0.562    0.521  0.522  0.523
           3   0.565  0.557  0.560    0.566  0.560  0.563    0.539  0.537  0.538
           4   0.582  0.548  0.553    0.557  0.560  0.568    0.498  0.498  0.498

      45   1   0.325  0.323  0.325    0.325  0.325  0.328    0.320  0.318  0.318
           2   0.337  0.339  0.344    0.344  0.345  0.349    0.326  0.328  0.328
           3   0.304  0.305  0.308    0.306  0.306  0.309    0.302  0.301  0.301
           4   0.339  0.339  0.341    0.342  0.342  0.345    0.330  0.329  0.329

* Number of Examinees (N), Number of Items (n), and Replication (r).

Table 9: Correlations Between Estimates and Parameters for Ability

                 Joint Bayesian-1       Joint Bayesian-2       Marginal Bayesian
  N    n   r*    αL     αT     αβT      αL     αT     αβT      αL     αT     αβT

100   15   1   0.890  0.893  0.893    0.893  0.894  0.893    0.892  0.893  0.894
           2   0.868  0.871  0.871    0.870  0.872  0.872    0.870  0.874  0.873
           3   0.847  0.849  0.849    0.848  0.848  0.849    0.848  0.849  0.848
           4   0.849  0.852  0.853    0.853  0.853  0.854    0.851  0.853  0.853

      45   1   0.951  0.948  0.948    0.948  0.947  0.947    0.950  0.949  0.949
           2   0.944  0.945  0.944    0.944  0.944  0.944    0.944  0.944  0.944
           3   0.951  0.952  0.952    0.952  0.951  0.951    0.950  0.951  0.951
           4   0.958  0.957  0.958    0.957  0.956  0.957    0.959  0.958  0.958

300   15   1   0.854  0.855  0.855    0.855  0.855  0.855    0.857  0.857  0.857
           2   0.856  0.856  0.857    0.856  0.857  0.857    0.855  0.855  0.854
           3   0.837  0.844  0.845    0.838  0.844  0.845    0.845  0.845  0.845
           4   0.858  0.873  0.874    0.973  0.874  0.873    0.874  0.873  0.873

      45   1   0.948  0.948  0.949    0.948  0.948  0.949    0.948  0.949  0.949
           2   0.947  0.947  0.947    0.947  0.947  0.947    0.947  0.946  0.946
           3   0.954  0.954  0.954    0.954  0.954  0.954    0.954  0.954  0.954
           4   0.944  0.944  0.944    0.944  0.944  0.945    0.945  0.945  0.945

* Number of Examinees (N), Number of Items (n), and Replication (r).

Table 10: Bias Results for Item Discrimination

                    Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
  N    n    a*      αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

100   15   0.66    0.27   0.30   0.28    0.30   0.33   0.32    0.19   0.23   0.22
           1.00    0.06   0.04   0.03    0.05   0.06   0.05   -0.02  -0.03  -0.05
           1.51    0.08  -0.13  -0.16   -0.03  -0.18  -0.19   -0.07  -0.27  -0.30

      45   0.57    0.20   0.26   0.24    0.29   0.32   0.31    0.16   0.23   0.22
           0.76    0.11   0.14   0.14    0.17   0.19   0.18    0.08   0.12   0.10
           1.00    0.07   0.04   0.02    0.04   0.03   0.02    0.03   0.01  -0.01
           1.32   -0.04   0.13  -0.13   -0.16  -0.20  -0.19   -0.04  -0.15  -0.19
           1.77   -0.25   0.33  -0.37   -0.38  -0.54  -0.57   -0.25  -0.44  -0.46

300   15   0.66    0.16   0.20   0.19    0.19   0.21   0.21    0.08   0.12   0.12
           1.00    0.07   0.07   0.06    0.07   0.07   0.07   -0.02  -0.02  -0.03
           1.51    0.33   0.03   0.02    0.16   0.02   0.00   -0.02  -0.15  -0.16

      45   0.57    0.06   0.12   0.11    0.12   0.15   0.14    0.02   0.08   0.08
           0.76    0.11   0.13   0.13    0.13   0.14   0.14    0.07   0.09   0.08
           1.00    0.03   0.02   0.02    0.02   0.02   0.02    0.01  -0.01  -0.01
           1.32   -0.01  -0.07  -0.07   -0.06  -0.09  -0.09   -0.04  -0.10  -0.11
           1.77    0.09  -0.05  -0.09   -0.06  -0.12  -0.16    0.07  -0.09  -0.09

* Number of Examinees (N), Number of Items (n), and Discrimination (a).

Table 11: Bias Results for Item Difficulty

                     Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
  N    n     b*      αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

100   15   -1.38   -0.19  -0.18  -0.14   -0.22  -0.24  -0.20    0.03   0.00   0.05
            0.00    0.00   0.00   0.00    0.00   0.00   0.00    0.00   0.00   0.00
            1.38    0.19   0.20   0.15    0.23   0.23   0.20   -0.01   0.00  -0.05

      45   -1.90   -0.10  -0.11   0.02   -0.16  -0.??  -0.08    0.05   0.00   0.06
           -0.95   -0.13  -0.11  -0.09   -0.12  -0.12  -0.11   -0.06  -0.05  -0.01
            0.00    0.02   0.02   0.02    0.02   0.02   0.02    0.02   0.02   0.01
            0.95    0.11   0.10   0.08    0.12   0.11   0.11    0.05   0.05   0.00
            1.90   -0.01   0.01  -0.10    0.06   0.07  -0.02   -0.15  -0.10  -0.17

300   15   -1.38   -0.28  -0.28  -0.26   -0.29  -0.29  -0.28   -0.01  -0.01   0.00
            0.00    0.00   0.00   0.01    0.02   0.01   0.01    0.01   0.01   0.01
            1.38    0.28   0.28   0.26    0.29   0.29   0.27    0.04   0.03   0.01

      45   -1.90   -0.09  -0.11  -0.04   -0.12  -0.13  -0.07    0.07   0.04   0.06
           -0.95   -0.15  -0.14  -0.14   -0.15  -0.15  -0.14   -0.07  -0.06  -0.04
            0.00    0.03   0.03   0.03    0.03   0.03   0.03    0.03   0.03   0.03
            0.95    0.04   0.04   0.04    0.04   0.05   0.05   -0.05  -0.04  -0.05
            1.90    0.17   0.16   0.09    0.17   0.18   0.12    0.00   0.01   0.00

* Number of Examinees (N), Number of Items (n), and Difficulty (b).

Table 12: Bias Results for Ability from 100-Examinee-15-Item Data Set

            Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
θ Level     αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

 -2.5      0.17   0.24   0.18    0.09   0.12   0.07    0.55   0.55   0.56
 -2.0      0.21   0.24   0.19    0.16   0.15   0.12    0.48   0.47   0.48
 -1.5      0.12   0.15   0.12    0.08   0.10   0.07    0.32   0.33   0.34
 -1.0     -0.12  -0.09  -0.12   -0.14  -0.14  -0.15    0.04   0.04   0.05
 -0.5     -0.01   0.00  -0.01   -0.02  -0.02  -0.03    0.05   0.05   0.05
  0.0     -0.05  -0.05  -0.05   -0.05  -0.06  -0.06   -0.04  -0.05  -0.04
  0.5     -0.03  -0.04  -0.03   -0.02  -0.03  -0.02   -0.09  -0.10  -0.10
  1.0     -0.04  -0.05  -0.02   -0.01  -0.01   0.01   -0.15  -0.15  -0.15
  1.5     -0.21  -0.21  -0.18   -0.16  -0.16  -0.13   -0.37  -0.36  -0.36
  2.0     -0.27  -0.31  -0.26   -0.21  -0.24  -0.20   -0.49  -0.51  -0.52
  2.5     -0.70  -0.71  -0.66   -0.62  -0.62  -0.58   -0.94  -0.91  -0.91

Table 13: Bias Results for Ability from 100-Examinee-45-Item Data Set

            Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
θ Level     αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

 -2.5      0.15   0.15   0.09    0.10   0.11   0.03    0.28   0.27   0.26
 -2.0      0.08   0.09   0.03    0.05   0.06   0.00    0.20   0.18   0.18
 -1.5      0.05   0.03  -0.01    0.00   0.01  -0.04    0.13   0.11   0.11
 -1.0      0.00  -0.01  -0.04   -0.03  -0.02  -0.05    0.05   0.04   0.04
 -0.5     -0.02  -0.02  -0.04   -0.03  -0.03  -0.05    0.01   0.00   0.00
  0.0     -0.03  -0.03  -0.03   -0.03  -0.03  -0.03   -0.03  -0.03  -0.03
  0.5      0.02   0.02   0.04    0.03   0.03   0.04    0.00   0.01   0.00
  1.0     -0.01   0.00   0.03    0.03   0.02   0.05   -0.05  -0.03  -0.04
  1.5     -0.10  -0.09  -0.05   -0.06  -0.07  -0.03   -0.16  -0.15  -0.15
  2.0     -0.10  -0.11  -0.05   -0.07  -0.08  -0.02   -0.19  -0.19  -0.19
  2.5     -0.27  -0.24  -0.18   -0.18  -0.18  -0.10   -0.35  -0.33  -0.32

Table 14: Bias Results for Ability from 300-Examinee-15-Item Data Set

            Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
θ Level     αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

 -2.5      0.65   0.57   0.55    0.57   0.53   0.51    0.86   0.85   0.86
 -2.0      0.37   0.33   0.31    0.33   0.30   0.29    0.59   0.59   0.60
 -1.5      0.10   0.09   0.07    0.08   0.06   0.05    0.31   0.31   0.32
 -1.0      0.02   0.03   0.01    0.02   0.01   0.00    0.18   0.17   0.18
 -0.5     -0.01   0.01   0.00    0.01   0.00   0.00    0.09   0.09   0.09
  0.0      0.02   0.04   0.04    0.05   0.04   0.04    0.04   0.04   0.04
  0.5     -0.02  -0.02  -0.01    0.00  -0.01   0.00   -0.10  -0.09  -0.09
  1.0     -0.10  -0.08  -0.07   -0.07  -0.07  -0.06   -0.23  -0.22  -0.22
  1.5     -0.15  -0.13  -0.11   -0.12  -0.10  -0.09   -0.34  -0.33  -0.33
  2.0     -0.28  -0.25  -0.23   -0.24  -0.22  -0.20   -0.53  -0.52  -0.52
  2.5     -0.51  -0.47  -0.44   -0.47  -0.43  -0.41   -0.80  -0.79  -0.80

Table 15: Bias Results for Ability from 300-Examinee-45-Item Data Set

            Joint Bayesian-1      Joint Bayesian-2      Marginal Bayesian
θ Level     αL     αT    αβT      αL     αT    αβT      αL     αT    αβT

 -2.5      0.10   0.08   0.05    0.06   0.05   0.02    0.26   0.23   0.23
 -2.0      0.11   0.11   0.08    0.09   0.08   0.06    0.23   0.22   0.22
 -1.5      0.08   0.07   0.05    0.06   0.05   0.03    0.17   0.15   0.15
 -1.0      0.06   0.05   0.03    0.04   0.03   0.02    0.11   0.10   0.10


-0.5 


0.01 


0.01 


0.00 


0.01 


0.01 